the addition of a further media specific identifier. For this book
publication it is the first edition print page number. For other media types
it could be an identifier native to that medium, for example a timecode
in the case of audio or video publications. The HPC considers this trial
citation identifier to be an emergent de facto standard. The historical
precedent is in the commentaries on Plato following the 1578
http://plato-dialogues.org/stephanus.htm

1. INTRODUCTION
The Hybrid Publishing Consortium (HPC) is a research network
which is part of the Hybrid Publishing Lab and works to support
Open Source software infrastructures. The HPC wishes to present
practical solutions to the problems with the current stage of the
evolution of the book. The HPC sees a glaring necessity for new
types of publications, books which are enhanced with interfaces
in order to take advantage of computation and digital networks.
The initial sections of this manifesto will outline the current
problems with the digital development of the book, with
reference to stages in its historical evolution. We will then go on
to present a framework for dealing with the problems in the later
sections.

Now that there are floods of Open Access content for users
to sort through, the book must develop to take on fresh interface
design challenges, both for improving reading and for supporting a
wide range of communities. The latter include art, design,
museums and the Digital Humanities groups, for all of whom
video, audio, hyper-images, code, text, simulations and game
sequences are needed.

HPC’s view is that current technology provisions in
publishing are costly, inefficient and need a step-up in R&D.
To support technical, open source infrastructures for publishing
we have identified the ‘Platform Independent Document Type’
as key. Our objective is to contribute to the working
implementation of an open standards based and transmedia
structured document for multi-format publishing. With
structured documents and accompanying systems publishers
can lower costs, increase revenues and support innovation.

HPC is about building public open source software
infrastructures for publishing to support the free flow of
knowledge, aka book liberation. Our mission statement is:

‘Every publication, in a universal format, available
for free in real-time’

This is our reworking of Amazon’s mission statement for its
Kindle product:

‘Every book ever printed, in any language,
all available in less than 60 seconds’

Currently digital publishing is dead in the water because
digital multi-format publications need prohibitive amounts of time
and money for rights clearance: the permissions
required for each new format, the necessary signed contracts etc.
So something has to give. For the scholarly community, Open
Access academic publishing has fixed these problems with open
licences, but other publishing sectors outside of academia
remain frozen by restrictive licensing designed for print media.

Our efforts in building technical infrastructures will be
wasted if content continues to be locked in, and this is where
HPC’s issue becomes as much a political as a technical problem.
Open intellectual property licences, such as Creative Commons,
are not enough on their own. Something else is needed if we want
to support the free flow of knowledge: a way to financially
support the publishers and the chain of skilled workers who are
involved in publication productions. This can be either by a form
of market metrics or by fair collections and redistribution
methods, with the latter involving a little less fussing around than
some market measurement. Open Access has meant publishers
are still paid; it is simply that the point of payment has moved
away from the reader to another point in the publishing process,
where the free flow of knowledge is not hampered.

2. BROKEN WORKFLOWS
Publishing is the largest creative industry in terms of revenue.
For example, in the EU there are 64,000 publishers with total
annual revenues of €23 billion.¹ The top 20% of publishers
generate 80% of the revenues which, if EU figures are taken as a
guide, means they are slicing off a cool €18 billion annually. In
the EU wealthier publishers can afford digital workflow systems
which are prohibitively expensive for others, starting at 100,000
euro per annum in end-to-end costs. The 51,000 publishers who
make up the bottom 80%, with average revenues of less than three
million euro, get by with various hand cranked custom solutions.
It is these smaller publishers and all the self publishers, authors
and institutions that we need to help.
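
The figures above can be checked with some quick arithmetic. The sketch below assumes the 80/20 split applies exactly to both publisher numbers and revenues, which the text only states in rounded form; the function and variable names are ours, for illustration only.

```python
# Back-of-envelope check of the EU publishing figures quoted in the text.
# Assumption: the 80/20 split is applied exactly; the text rounds its results.

TOTAL_PUBLISHERS = 64_000
TOTAL_REVENUE_EUR = 23_000_000_000


def split_market(publishers, revenue, top_share=0.20, top_revenue_share=0.80):
    """Return publisher counts and revenues for the top and bottom groups."""
    top_publishers = round(publishers * top_share)
    bottom_publishers = publishers - top_publishers
    top_revenue = revenue * top_revenue_share
    bottom_revenue = revenue - top_revenue
    return {
        "top_publishers": top_publishers,
        "bottom_publishers": bottom_publishers,
        "top_revenue": top_revenue,
        "bottom_revenue": bottom_revenue,
        "bottom_average": bottom_revenue / bottom_publishers,
    }


figures = split_market(TOTAL_PUBLISHERS, TOTAL_REVENUE_EUR)
```

Under these assumptions the top group takes about 18.4 billion euro, the bottom group numbers 51,200 publishers, and their average revenue comes out at roughly 90,000 euro each, well inside the ‘less than three million euro’ bound given above.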

A high end digital publishing system involves workflow
integration and dynamic publishing features: multi-format
publishing; standardised markup; rights management; asset
management; reading metrics; automated distribution;
metadata management; revisioning; document management;
and payment systems, etc. It is notable that high end
publishing systems continue to rely for many of these
processes on offshored cheap labor.

Take one part of the workflow, multi-format digital
publishing, which involves publishing to eBook, HTML, PDF,
App and XML or other markup. Each of these formats has to be a
‘Publication Ready Output’ for each distribution channel, which
involves more than merely making the appropriate file type per
format. We can see that the problems here are multi-format
design layouts and revisioning. Currently a typical publisher
would use a tool chain most likely comprising Microsoft Word
and Adobe Creative Suite, neither of which is capable of making
layout designs and handling revisions for multi-format in any
practical or efficient manner. In this conventional scenario
each format requires a separate workflow for
layout design, adding a new cost overhead for each format. And
then on the side of revisioning, for example adding last minute
edits, the current tool chain, again, involves each format being a
separate workflow, so that updating a simple typo means editing
four or five different files, which adds costs and drives editors
crazy. These two factors alone, out of many more, make the
digital publishing workflow uneconomic and unviable for the
publisher. The net result is that publishers miss out on revenues
and find it nearly impossible to entertain thoughts of innovating
their processes or product lines.
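
The alternative to editing four or five files is to correct one source and regenerate every format from it. The following is a minimal sketch of that idea, not a description of any existing HPC software: the document structure, the three renderers and the names `source`, `RENDERERS` and `publish` are all assumptions made for illustration.

```python
from html import escape

# A hypothetical single-source document: one structured record, many outputs.
source = {
    "title": "The Unbound Book",
    "paragraphs": ["Publishng is changing.", "Formats multiply."],
}


def to_html(doc):
    body = "".join(f"<p>{escape(p)}</p>" for p in doc["paragraphs"])
    return f"<h1>{escape(doc['title'])}</h1>{body}"


def to_markdown(doc):
    return "# " + doc["title"] + "\n\n" + "\n\n".join(doc["paragraphs"])


def to_plain_text(doc):
    return doc["title"].upper() + "\n\n" + "\n\n".join(doc["paragraphs"])


RENDERERS = {"html": to_html, "markdown": to_markdown, "text": to_plain_text}


def publish(doc):
    """Regenerate every output format from the one source."""
    return {name: render(doc) for name, render in RENDERERS.items()}


# Fix the typo once in the source, then regenerate all formats.
source["paragraphs"][0] = source["paragraphs"][0].replace("Publishng", "Publishing")
outputs = publish(source)
```

The correction is made in one place and reaches every format on the next run. A real system would also need per-format layout templates, which this sketch leaves out.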

There are many new online services with better and more
integrated workflows, but they need more support in terms of
development to reach maturity before publishers switch
systems. The risk to publishers is that these new services
will close down, whether due to insufficiently robust technology or
because of other problems, which means they are not viable for
an industry with hard and fixed deadlines.

It is not solely key applications that are letting publishers
down, it is also standards and technologies. It remains the case
that common standards for document markups like HTML and
EPUB cannot properly cope with basic publication components,
including footnotes, styling in running headers, pagination and
annotation. Infrastructural technologies are also unavailable as
public services for reuse by individuals or businesses: examples
include public web search engine indexes, cost free
micropayment and Optical Character Recognition. Each of these
areas does feature attempts and programmes to address the
issues, but these have serious flaws or unresolved problems.
Specifically, a public search engine index is only just being
proposed in the EU, as the Open Web Index² put forward by Dirk
Lewandowski of the Hamburg University of Applied Sciences; for
micropayments, Bitcoin remains unviable while its value is so
volatile, due to lack of regulation over speculators; and OCR
projects like Google’s ‘Tesseract OCR’³ involve Google
keeping private the word pattern recognition it uses for
scanning in Google Books.


Nevertheless there are many groups working on improvements in
a variety of areas in the technology tool chain as public
infrastructure: W3C, International Digital Publishing Forum
(IDPF), research councils and knowledge infrastructure groups;
Deutsche Forschungsgemeinschaft (German Research
Foundation, or DFG), Jisc, foundations and, most importantly,
Open Source initiatives (e.g., The Libre Graphics Meeting) and
the startup sector.

A specimen sheet of typefaces and languages,
by William Caslon I, letter founder, c. 1728.
https://en.wikipedia.org/wiki/File:Caslon-schriftmusterblatt.jpeg

3. THE POOR BOOK
Industry pressures have led digital publishing to create a poor
simulacrum of the book form, notably the eBook, which
degrades or completely loses the typographic or mnemonic
qualities of the paper book: page numbers, folios, speed of
browsing, typographic detail of fonts and kerning etc. The
typographic, navigational and other conventions of moveable
type print have been contributed over the centuries by many
anonymous printers, clerics and publishers. The book has never
been a fixed entity but instead has evolved, normally acquiring
improvements in the process, yet in the technology environment
of the last forty years this process seems to have been reversed.
Looking at the typographic craft and art in the sample illustration
of multilingual typesetting (English, Hebrew, Greek and
Arabic), which dates from 1728 and is by the letter founder
William Caslon, it is clear that a current e-ink reader would be
hard pushed to equal this level of typographic quality or, more
specifically, to render the language glyph sets and the letter
spacing to aid reading. So, once more, a company like Amazon
might have a mission statement for its Kindle product, ‘Every
book ever printed, in any language, all available in less than 60
seconds’, but the key questions are what those books will look
like and how they will work for the reader.

If our standpoint were that of four decades ago then the
technology companies and research funders could be excused for
not addressing the raft of outstanding fundamental
technology-design issues concerning publishing, books and reading.
Unfortunately these issues have been poorly addressed since
then, despite the ensuing technological advances. In 1974 the
basics of the personal computer, tablet and networking were still
challenges only just being overcome in terms of processing
power, technologies for high quality displays, functional
programming languages, standardised protocols etc. But these
issues were mostly resolved twenty years ago, and with Moore’s
law of exponential processor improvement the future should not
have been difficult to plan for.

The DynaBook was first described by Kay in 1968
and then written up in a 1972 paper, ‘A Personal
Computer for Children of All Ages’.

Moving on from the basics of the book and looking at
what the book could aspire to, what happened more than forty
years ago, alongside the invention of the personal computer by
Alan Kay and the Learning Research Group (LRG) at Xerox
Palo Alto Research Center (PARC), was the creation of the idea
of the DynaBook (which first appeared in a paper entitled ‘A
Personal Computer for Children of All Ages’).⁴ DynaBook is
short for ‘Dynamic Book’. Essentially Kay and the team at LRG
invented the personal computer and tablet, which Steve Jobs
then copied and sold as the Apple Computer and, later, iPad.
What failed to happen, and what Apple and others didn’t pick
up on, was the vision which lay behind the DynaBook: to enhance
the book by understanding how people learn.

The DynaBook encapsulates an experience which is more
active than passive, providing us with a better ‘book’. Moreover
for Kay the personal computer and ideas about what a book could
become were rooted in an understanding of the technology based
around McLuhanesque notions. Kay could see that industry
trends led to computers being designed with much of their
content adopted from previous media, the metaphor of the page
in GUIs for example from print, with networked computers’ own
attributes only just beginning to be discovered.

Kay’s other contribution to the idea of the book was the
‘Active Essay’, a publication that includes computational objects,
that is, essays containing text and executable code to run
simulations. One example is outlined in ‘Active Essays on the
Web’ [Takashi Yamamiya, 2009
(http://www.vpri.org/pdf/tr2009002_active_essays.pdf);
Yamamiya has been a member of the Viewpoints Research
Institute founded by Alan Kay]. The current web version of the
Active Essay runs as Chalkboard
(http://tinlizzie.org/chalkboard/#Home).

Interestingly the technologies of the last eight years of a
post-LAMP (Linux, Apache, MySQL and Perl) model, delivering
JavaScript frameworks with real-time browser updating, like
Angular.js, template libraries like Bootstrap, and speedier,
more scalable NoSQL databases, together
with instant cloud deployment, have meant that the ideas of the
DynaBook and the Active Essay can start to come into play. This
has been accompanied by a change in people’s expectations
towards demanding dynamic interface environments.

4. THE UNBOUND BOOK
Transmedia publishing in a post-Open Access context will
include new kinds of media and masses of open licensed content
(assuming the blockage of rights clearance is removed), with
granular content reuse made possible from Open Access
repositories. This has led us to re-examine the conventional
book. To understand the book and translate it for
computational use, we have found that we have had to atomise
it, breaking it down into its smallest parts, before rebuilding a
computable representation. This means creating a structured
tree of components with meta descriptions, that is able to
integrate many external and constantly changing data sources.
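
A structured tree of this kind can be sketched in a few lines. The example below is an illustration under our own assumptions, not the HPC document format: the `Component` class, its field names and the example URL are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One node of an atomised book: a chapter, paragraph, image, and so on."""
    kind: str
    meta: dict = field(default_factory=dict)      # the node's meta description
    text: str = ""
    source_url: str = ""                          # optional external data source
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child

    def walk(self):
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


book = Component("book", meta={"title": "Example Title", "licence": "CC BY"})
chapter = book.add(Component("chapter", meta={"number": 1}))
chapter.add(Component("paragraph", text="Opening paragraph."))
chapter.add(Component("image", source_url="https://example.org/figure-1.png",
                      meta={"caption": "A placeholder figure"}))

kinds = [node.kind for node in book.walk()]
external = [node for node in book.walk() if node.source_url]
```

Walking the tree gives access to every part of the book, and nodes that point at an external source can be refreshed whenever that source changes.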

Once the book is broken apart then the publishing
architecture that stems from the conventions of knowledge
institutions and the divisions of labor comes under examination.
These institutions include archives, education, research and the
library. With a free hand to recombine models from these areas of
knowledge management, we see that the visions of the book
developed in the past, as well as material from information science
histories, lend inspiration when examining the basic limitations of
current Internet technologies, HTTP and HTML etc., currently being
adopted for the development of the digital book.

The development of the conventional book has been closely
accompanied by parallel experiments with the unbound book
form, essentially what became the library card catalog. In the
2011 book ‘Paper Machines’⁵ the media historian Markus
Krajewski traces a history of the European unbound book
beginning as library records in the sixteenth century, as created
by Swiss librarian Konrad Gessner, on to Leibniz in the
seventeenth century using the scholar’s cabinet of quotes and
references, then on to the US Dewey Decimal System of the
nineteenth century and thence to the transfer of the library record
keeping system to businesses as the card index system in the early
twentieth century. Key figures that bridge the transition from the
‘universal paper machine’ (Krajewski’s term) to the ‘digital
universal machine’ of the computer are Melvil Dewey and Paul
Otlet. Dewey is renowned for his nineteenth century American
dreams of universal access to knowledge. Less well known is the
work of the Belgian librarian Paul Otlet in the early twentieth
century with his pre-Internet, global paper packet Internet, or
‘electric telescopes’ as he described them, using telegraphs,
early TV and audio radio.

Top left: Hybrid card index in book form.
(From Placcius 1689, p. 67.)

Bottom left: The hook in the excerpt cabinet.
(From Placcius 1689, p. 155.)

Right: Leibniz 17C scholar’s cabinet of quotes
and references. Excerpt cabinet. (From Placcius
1689, p. 152.) Leibniz’s method of the scholar’s
box combines a classification system with a
permanent storage facility, the cabinet. So in a
way this is similar to the use of Zotero or other
citation management systems, but instead uses
loose sheets of paper on hooks. The strips are
hung on poles or placed into hybrid books.

This history helps in part to define the unbound book. To
gain a fuller definition we need to add the agent of digital
disruption or, according to the term coined by economist Joseph
Schumpeter, of ‘creative destruction’.⁶ ‘Creative destruction’ is
a process in which a new economy emerges out of the destruction
of a previous order. Technology innovation is the agent of this
change, and Schumpeter describes the entrepreneur as the one
who exploits it.

Ironically it is the unbound book and its progeny, the card
index system, that led to the punch card, the early data packet of
what are now packet networks. The packet network is where all
media can be broken down into common data packets and sent to
any device. It is this technology that acts as the agent of creative
destruction, that makes up basic Internet and mobile networks,
and has made the concept of the unbound book finally realisable.
It is the scaling up of this function of packet networks which
means that the fundamentals of publishing are in flux. The
innovation of the unbound book and the replacement of print
books in many contexts means that economic models crumble,
institutions of knowledge lose their relevance, and copyright
laws become unworkable and act as an impediment to
knowledge dissemination.

Dewey and Otlet both point to inspirational visions of the
unbound book as the knowledge components of universal
libraries, visions which embody ambitions to make the world’s
knowledge universally available. Dewey is more famous, with
his mechanism of the classification system and the card system,
immersed in the Taylorist obsessions of efficiency, speed and
economies of scale. Otlet was lost to obscurity in the chaos of
World War Two, but had imagined and built an elaborate index card
knowledge system, again as a vision of the global library. Otlet
has only recently been rediscovered, for example in the book by
Alex Wright, ‘Cataloging the World: Paul Otlet and the Birth of the
Information Age’, 2014.⁷

Top: Dewey’s scheme, displayed by The Bridge.
(From Buhrer and Saager 1912, p. 4.)

Bottom: Dewey’s scheme, displayed by the Institut
International de Bibliographie. (From Institut
International de Bibliographie 1914, p. 45.)

Top: Paul Otlet, Traité de Documentation, 1934, p. 41.
Bottom: Paul Otlet, 1934, vision of universal
knowledge systems.
http://mundaneumpaulotlet.tumblr.com/
http://www.flickriver.com/photos/marcwathieu/sets/72157623466540563/

5. DESIGNING THE BOOK OF THE FUTURE
If we think about new screen interfaces for publications, we can
start by considering the readers and how to help them read: to
assist them to remember, explore, experience, gloss, browse,
reuse and rewrite. We can enhance what they may have already
learned to do with paper books. The new interface, supported by
real-time available open IPR content, can have references inline,
with full copies of any publication mentioned, as well as
highlighting of the sections the reader is interested in. In fact, any
media cited or referenced should be available in full for
transmedia publication, be it a game sequence, an exact time
stamped point in a video clip, a scalable 3D model, a data
calculation or a simulation, etc.

To recap, the most high profile impediment to the
transmedia publication has been copyright. In the Open Access
model there is no need to clear rights, because it has already been
done via an open licence. Until this happens more generally
transmedia publications will remain a non-starter. As well as
abolishing rights clearance the point of payment must move up
the value chain.

The two other hurdles to the transmedia publication are,
first, reader expectations and, second, technology. Both have
turned around one hundred and eighty degrees since the advent of
the precursors in the journey of the digital publication, Hypertext
and Multimedia, which appeared in the 1980s and 1990s
respectively. Now users expect real-time updating interfaces and
are disappointed when they are absent. Technologies for
interfaces now have JavaScript for rich interactivity, design
frameworks are templated, and standards allow systems to
transfer media and communicate automatically via
Application Programming Interfaces (APIs).

If we are thinking about the new publication design, the
reader as receiver or consumer is only one role to consider. We
must also examine many of the other roles in the lifecycle of a
publication: the librarian, writer, designer, editor, tutor etc.


As an example in the area of the writer and editor, real-time
collaborative text editors (GDocs, Fidus Writer, Etherpad,
Ethertoff) change the skill set of the user, change the interface of
the publication from read only to read/write, and so intervene in
the intimacy of the act of authoring. As Kenneth Goldsmith
explored in his book Uncreative Writing,⁸ in this publication
lifecycle there is also the role of machinic writing and
interventions to consider. In the example of real-time
collaborative texts the key algorithm is called Operational
Transformation,⁹ which essentially records every edit as an
operation and adjusts concurrent operations against one another,
so that all collaborators arrive at the same text and earlier edits
can be retrieved by any one of them.
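
To make the idea concrete, here is a deliberately small sketch of the transformation step for two concurrent insertions. It is an illustration of the general technique only, not the algorithm used by any of the editors named above. It handles insert operations alone, and the `site` tie-breaker and all names are our assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int      # index at which the text is inserted
    text: str
    site: int     # hypothetical collaborator id, used only to break ties


def apply_op(doc, op):
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op, other):
    """Rewrite `op` so it can be applied after the concurrent `other`."""
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        # `other` added text in front of `op`, so shift `op` to the right.
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


base = "open book"
a = Insert(0, "An ", site=1)         # editor A types at the start
b = Insert(9, " format", site=2)     # editor B types at the end, concurrently

# Each site applies its own edit first, then the transformed remote edit.
at_site_a = apply_op(apply_op(base, a), transform(b, a))
at_site_b = apply_op(apply_op(base, b), transform(a, b))
```

Both sites end up with the same text even though they applied the edits in a different order. A working editor would also need deletions, a shared ordering of operations and a stored history, none of which is shown here.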

For our purposes of designing interfaces for new types of
publications we group such semi-automated computational
processes in the digital workflow under the umbrella term
‘Dynamic Publishing’; they include: layout, multi-format
conversion, distribution, rights management, file transfer,
translation workflows, document updates, payments and
reading metrics etc. Our aim is to explore these processes in
rethinking the publication interface.

As part of Dynamic Publishing and the networked
publication, privacy has to be addressed as a fundamental right
of the reader. At the same time, tracking and the reading
equivalent of the ‘social graph’ are qualities that are very useful.
Nevertheless, privacy needs to be addressed technically and
politically. Firstly, metrics data needed for a ‘reading graph’ must
be anonymized. Secondly, access must be allowed to the missing
matrix of uncreative publishing: the Big Data of reading
analytics, incorporating the Ngram of reading patterns.
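
One simple way to approach the first requirement is to publish only aggregate counts and to drop reader identifiers altogether. The sketch below assumes a made-up event format and a threshold of three readers; both are illustrative choices of ours, and thresholded aggregation on its own should not be read as a guarantee of anonymity.

```python
from collections import Counter

# Hypothetical raw reading events: (reader_id, publication_id, section_id).
events = [
    ("reader-1", "pub-a", "ch1"),
    ("reader-2", "pub-a", "ch1"),
    ("reader-3", "pub-a", "ch1"),
    ("reader-1", "pub-a", "ch2"),
    ("reader-1", "pub-a", "ch1"),   # a re-read by the same reader
]


def reading_graph(events, min_readers=3):
    """Count distinct readers per section, then drop the identifiers and any
    section read by fewer than `min_readers` people."""
    distinct = {(reader, pub, section) for reader, pub, section in events}
    counts = Counter((pub, section) for _, pub, section in distinct)
    return {key: n for key, n in counts.items() if n >= min_readers}


graph = reading_graph(events)
```

The result keeps the popular section and suppresses the one that only a single reader opened, so no individual reading path can be read off the published data.
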
Major categories in the publishing infrastructure.


6. INFRASTRUCTURE 
The objective of the Hybrid Publishing Consortium is to
support Open Source public infrastructures for transmedia,
multi-format publishing. This means using structured
document architectures to output publication formats such as
EPUB, HTML5, ODT, DOCX, screen PDF and PDF for
print-on-demand etc.

All of this can be achieved by connecting existing platforms
and supporting development communities with expertise,
resources and a knowledge network, and by building new
components if they are missing.

Our approach is format agnostic, so platforms can use XML,
HTML, Markdown, ODT etc. for document markup because we
would support an API for interoperability, so long as the formats
support the required features for what we call ‘Publication Ready
Outputs’ (PROs).

A PRO is made up of a combination of file type
specifications, metadata requirements for the distribution
channel, as well as ‘style guides’ for editors and designers for
creating specific publication components for multi-format
publishing. The latter includes items such as tables of contents,
and front cover and back cover texts. A PRO profile is needed for
each output format because one format will not automatically
translate to another format, e.g. a print book to EPUB.
Fundamental to this is the definition of a Single Source file, which
will act as a master universal document and also a container for
multiple sources of external data, for example different image
sizes from an external source for responsive web design, or
external metadata, such as bibliographic citations or book trade
metadata for publishing purposes.
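
The relationship between a PRO profile and the Single Source file can be sketched as a checklist that is run against the source. Everything in the example below is an assumption made for illustration: the `ProProfile` class, the two profiles and their required fields are invented and do not describe the requirements of any real distribution channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProProfile:
    """Hypothetical 'Publication Ready Output' profile for one channel."""
    name: str
    file_type: str
    required_metadata: frozenset
    required_components: frozenset


# Illustrative profiles only; real channel requirements would differ.
EPUB_STORE = ProProfile(
    "epub-store", "epub",
    frozenset({"title", "creator", "language", "identifier"}),
    frozenset({"toc", "front_cover"}))
PRINT_ON_DEMAND = ProProfile(
    "print-on-demand", "pdf",
    frozenset({"title", "creator", "isbn"}),
    frozenset({"toc", "front_cover", "back_cover_text"}))


def missing_for(profile, single_source):
    """List what the single source still lacks for this output profile."""
    metadata = set(single_source.get("metadata", {}))
    components = set(single_source.get("components", {}))
    return {
        "metadata": sorted(profile.required_metadata - metadata),
        "components": sorted(profile.required_components - components),
    }


single_source = {
    "metadata": {"title": "Example", "creator": "A. Author",
                 "language": "en", "identifier": "urn:uuid:placeholder"},
    "components": {"toc": ["Chapter 1"], "front_cover": "cover.png"},
}

epub_gaps = missing_for(EPUB_STORE, single_source)
pod_gaps = missing_for(PRINT_ON_DEMAND, single_source)
```

Here the same source is ready for the EPUB profile but still needs an ISBN and a back cover text for print-on-demand, which is the sense in which one format does not automatically translate to another.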

However the LAMP technology stack is now being superseded by
JavaScript technologies built on virtual machines, including
items such as Node.js, unstructured databases like MongoDB¹⁰
and real-time display technologies like Meteor. The result is a
smoother GUI experience in which content from multiple sources
is updated in real-time in the browser. This is a move away from
the server-based client architecture of LAMP. For the Open
Access community this is an exciting opportunity because it
means the Open IPR content repositories can offer a rich resource
for all types of new publishing models, as well as Virtual Research
Environments (VRE) for scholars. Open Source examples that
have used these technologies can be seen in investigative
journalism, such as DocumentCloud
(http://www.documentcloud.org).


Major Stages in the Workflow 
We have identified six stages in the publishing
workflow/lifecycle, connected by a Transmedia Publishing API
(item VII below).


I) Document Validation - writing, authoring and structuring
Validation is required to create the structured documents. An
interactive feedback GUI is needed to gain the authors’ help to
make structuring decisions that the validation algorithm cannot
take on its own. These involve the validation rule set, structure
and semantic information: Document layout e.g. headers, bold
etc; Document structure e.g. pagination, chapter etc; and
Metadata fields and standards for the document. An external
document editing system will be able to have our rule set applied
to its documents, via an API.
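
A rule set of this kind can be pictured as a list of small checks, each of which returns questions for the author rather than silently changing the document. The sketch below is our own illustration and not the HPC validation software: the two rules, the Markdown-style heading syntax and the names `RULESET` and `validate` are assumptions.

```python
import re


# Hypothetical rule set: each rule returns a list of issues for the author.
def rule_has_title(doc):
    if doc.get("metadata", {}).get("title"):
        return []
    return ["metadata: title is missing"]


def rule_heading_levels(doc):
    """Flag headings that skip a level, e.g. level 3 directly after level 1."""
    issues, previous = [], 0
    for number, line in enumerate(doc["lines"], start=1):
        match = re.match(r"(#+) ", line)
        if match:
            level = len(match.group(1))
            if level > previous + 1:
                issues.append(
                    f"line {number}: heading jumps from level {previous} to {level}")
            previous = level
    return issues


RULESET = [rule_has_title, rule_heading_levels]


def validate(doc, ruleset=RULESET):
    """Collect the questions the algorithm cannot settle without the author."""
    return [issue for rule in ruleset for issue in rule(doc)]


draft = {
    "metadata": {},
    "lines": ["# Introduction", "Some text.", "### Background", "More text."],
}
issues = validate(draft)
```

An interactive GUI would present each issue to the author and record the decision, and an external editing system could call the same `validate` function through an API.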


II) Document Editing - text, citations, metadata, images and media
Adding more components to the document on top of the text
document’s linear string of text means that we need to be able to
separate out these components, create a scheme for their storage,
and allow access to external data and media sources. External
sources would be citations from Zotero, as well as metadata from
library systems and archive repositories such as Pandora.
Additionally, revisioning issues are important here.


III) Asset Storage - revisioning, meta description
The publication asset management component of the technology
stack will use a NoSQL based DB infrastructure, with the metadata
frameworks MODS and VRA Core 4, as used by the Tamboti
meta description framework of the Heidelberg Research
Architecture.
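
The revisioning side of asset storage can be illustrated with a toy store that keeps every version of an asset together with its meta description. This is a sketch under our own assumptions: it is an in-memory stand-in rather than a NoSQL database, the metadata keys are placeholders rather than MODS or VRA Core fields, and the `AssetStore` class is hypothetical.

```python
import hashlib


class AssetStore:
    """Toy in-memory store that keeps every revision of an asset.
    A stand-in for the NoSQL database the text mentions, not a real one."""

    def __init__(self):
        self._blobs = {}       # content hash -> bytes
        self._history = {}     # asset id -> list of (content hash, metadata)

    def put(self, asset_id, data, metadata):
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        self._history.setdefault(asset_id, []).append((digest, dict(metadata)))
        return len(self._history[asset_id])      # 1-based revision number

    def get(self, asset_id, revision=None):
        """Return (data, metadata) for a revision; the latest when omitted."""
        history = self._history[asset_id]
        digest, metadata = history[-1 if revision is None else revision - 1]
        return self._blobs[digest], metadata


store = AssetStore()
first = store.put("figure-1", b"draft image bytes", {"caption": "Draft"})
second = store.put("figure-1", b"final image bytes", {"caption": "Final"})
```

Earlier revisions stay retrievable by number, so a last minute edit never overwrites the version that an already distributed format was built from.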


IV) Layout Design - typesetting and templates: semi-automatic and automatic
This involves the use of Multi-format Templates, which in
turn requires connection to Content Distribution Networks
(CDN), so that designers can author templates in software and
graphic design libraries they are familiar with, like Bootstrap. In
effect this means creating modified Bootstrap modules for apps,
mobile, EPUB etc. Examples would be open source frameworks
such as PugPig (http://pugpig.com) and Famo.us (http://famo.us).


V) Publishing - multi-format transformation, distribution and remixing
Multi-format transformation means using our own
A-machine software eco-system for multi-format transformation.
The end publications can then be distributed to POD and digital
distributors and repositories via a number of aggregators. The
structured document format will make the documents and
publication available for remixing at a granular level, down to
specific points in the text or video clips. The format is designed to
allow a wide variety of new publication uses.


VI) Publication Collections - library, bookshop, academic and OER repositories
Firstly, this means supporting the creation of collections of
publications. It will include an API for distributing collections
with Open Publication Distribution System (OPDS)¹¹ metadata
for inclusion in other systems. Secondly, it means creating easy to
deploy real-time web platforms using Meteor and Node.js etc. to
allow publishers to set up their own libraries, repositories and
shops. With these two sets of framework options publishers,
editors, educators and librarians can create custom packages to
fit into existing systems or deploy web platforms if needed.
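
OPDS catalogs are Atom feeds, so a collection can be exposed to other systems as a small XML document. The sketch below builds a deliberately simplified feed with the Python standard library. The identifiers, titles and URLs are placeholders, the helper name `opds_feed` is ours, and a conforming catalog needs more than is shown here, so the OPDS specification should be consulted before relying on it.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ACQUISITION = "http://opds-spec.org/acquisition"   # OPDS acquisition link relation
ET.register_namespace("", ATOM)


def opds_feed(feed_id, title, updated, publications):
    """Build a simplified OPDS-style Atom feed for a collection."""
    feed = ET.Element(f"{{{ATOM}}}feed")
    for tag, value in (("id", feed_id), ("title", title), ("updated", updated)):
        ET.SubElement(feed, f"{{{ATOM}}}{tag}").text = value
    for pub in publications:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        for tag in ("id", "title", "updated"):
            ET.SubElement(entry, f"{{{ATOM}}}{tag}").text = pub[tag]
        ET.SubElement(entry, f"{{{ATOM}}}link", rel=ACQUISITION,
                      href=pub["href"], type=pub["type"])
    return ET.tostring(feed, encoding="unicode")


catalog = opds_feed(
    "urn:uuid:00000000-0000-0000-0000-000000000000",   # placeholder id
    "Example collection",
    "2015-01-01T00:00:00Z",
    [{"id": "urn:uuid:00000000-0000-0000-0000-000000000001",
      "title": "Example publication",
      "updated": "2015-01-01T00:00:00Z",
      "href": "https://example.org/books/example.epub",
      "type": "application/epub+zip"}],
)
```

A reading application or another repository that understands OPDS could then fetch this feed and offer the listed publication for download.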


VII) Transmedia Publishing API
An Application Programming Interface (API) is the way in which our
systems’ modular components can communicate with other
systems on the internet securely. This means that the
functionality we are researching and developing, including
validation, publication asset structuring, templates, and
collections, can be integrated into the other systems we are
connecting to.

7. THE PLAN
It is important to emphasize that the HPC is not a fixed and
finalised group and we are only at the beginning of forming the
network. We want to invite more people to join. The plan is for
long term collaboration with a network of stakeholders to
support Open Source infrastructures for transmedia,
multi-format, scholarly publishing. The objective is to put a reliable
and trustworthy tool set in front of publishers, so that those
publishers themselves can then start to innovate. This means a
wholesale replacement of proprietary software applications,
improvements in Open Source tools, open standards and formats,
and the introduction of new interoperable systems where they do
not yet exist, for example for micro-payments. We do
acknowledge that the process towards software maturity is a long
one, and that Open Source is merely a design and engineering
methodology and not a guarantee of quality. LibreOffice is an
example of such a tool in the infrastructure; it has taken more
than fourteen years of work for LibreOffice to become a reliable
replacement for Microsoft Word, but it is now stable software.
LibreOffice is an example of Open Source reverse engineering, of
figuring out how something works and building a clone. Other
projects are more ground-breaking and have different sets of
challenges, but again we see that they can become market leaders
over time, for example the eBook manager Calibre.

We recognise that this development and adoption of Open
Source is a political issue which involves policy, economics and
technology, and which needs multi-stakeholder agreement to
move technology developments forward.

Our plan is divided into three complementary areas of
activity: research, open learning and ventures. These activities
would be supported by the formation of two kinds of entity: firstly
a Research and Technology Organisation (RTO), the ‘Hybrid
Publishing Consortium’, with a series of academic institutions
and other partners, and secondly private companies, currently
including ‘Infomesh Technologies’.

Research - current research focuses on issues involved in
multi-format transformation: the creation of a structured
document; layout design issues; and understanding the users and
their skill sets in this area of the workflow. Research will continue
in a number of ways: on dedicated software projects, either as
collaborations or as networks with other academic partners, but
also in industry contexts and on ventures. Our next areas of focus
will be on an interactive validation GUI for structured writing, as
well as on real-time web GUIs and modular templated designs.

Open learning - to support the Open Source community
we are developing a number of units dealing with Dynamic
Publishing, for an open curriculum of Bachelor's and Master's
courses which is being discussed by members of Libre Graphics
Meeting. This would also be implemented in consultation with
The Open Syllabus Project (OSP) of Columbia University. The
courses would be designed to work with programmes of the UN
World Summit on the Information Society (WSIS), OER track.

Ventures - to support the long-term sustainability of
infrastructure components, projects need to move from research
into product-focused development as well as support service
provision. These ventures would be based on projects that are
developed by the Hybrid Publishing Consortium or by partners.
Additionally we would develop a series of regional business hubs
for local service provision, and to act as knowledge networks for
technologists and designers to pick up the tools we are
supporting and run their own ventures. As an example, we
have joined the Open Invention Network (OIN), which is
supporting Linux developers and businesses by building a
collective legal defensive solution against predatory and
restrictive patent protection.

8. OUR RESEARCH
We have combined a number of methods that fit under the
umbrella of Design Research, all of which involve knowledge
production through the process of making. These methods are:
rapid prototyping; interviews; workflow and lifecycle mapping;
discourse analysis, specifically TPINK (Technology, Power,
Ideology, Normativity and Communication); collaborative
authoring of manuals and guides; Sommerville’s software
requirements process; publication forensics; publication
prototype productions; Open Source software releases and open
code review; and stakeholder consultations and knowledge
network creation.

Rapid prototyping allows us to test out dynamic publishing
opportunities and ways of integrating software into user
workflows. Our findings demonstrate the need for a technical
tool that lets publishers’ workflows adapt easily to the demands
of multi-format publishing. The rapid prototyping projects fall
into two categories. The first is infrastructure software design in
the area of multi-format, single source publishing
transformation engines. The second is a series of publisher
prototypes, which means working with publishers to make
examples of digital publication productions.

Our research was initially outlined in a research plan in
2012. This runs until 2015 and will then be reviewed with new
priorities in order to run for a further three years. The 2012
research plan can be found here:

Dynamic Publishing—New Platforms, New Readers! 
http://www.consortium.io/research-plan 

Infrastructure

TypeSetr - an Open Source publishing software
component for multi-format document conversion. It produces
the following formats with automatic templated design layouts
and format conversion: EPUB 3.0, HTML5, PDF and others.
Code - https://github.com/consortium/typesetr-converter
Demo - https://typesetr.consortium.io
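
To make the single source, multi-format idea concrete, here is a minimal sketch in Python. It is not TypeSetr's code or API: the `Document` and `Section` classes, the `RENDERERS` registry and the `convert` function are hypothetical names invented for illustration, and only two simple output formats are shown. The point is that every format is derived from one structured source, so adding a format means adding a renderer rather than re-editing the text.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Dict, List


@dataclass
class Section:
    """One headed block of a structured source document (hypothetical model)."""
    heading: str
    paragraphs: List[str] = field(default_factory=list)


@dataclass
class Document:
    """A single structured source from which every output format is derived."""
    title: str
    sections: List[Section] = field(default_factory=list)


def to_html5(doc: Document) -> str:
    """Render the source as a minimal HTML5 page, escaping all text."""
    lines = [
        "<!DOCTYPE html>",
        '<html lang="en">',
        f'<head><meta charset="utf-8"><title>{escape(doc.title)}</title></head>',
        "<body>",
        f"<h1>{escape(doc.title)}</h1>",
    ]
    for section in doc.sections:
        lines.append(f"<h2>{escape(section.heading)}</h2>")
        lines.extend(f"<p>{escape(p)}</p>" for p in section.paragraphs)
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


def to_plain_text(doc: Document) -> str:
    """Render the same source as plain text with underlined headings."""
    lines = [doc.title, "=" * len(doc.title), ""]
    for section in doc.sections:
        lines += [section.heading, "-" * len(section.heading), ""]
        for p in section.paragraphs:
            lines += [p, ""]
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical registry: supporting a new format means adding one renderer.
RENDERERS = {"html": to_html5, "txt": to_plain_text}


def convert(doc: Document, formats: List[str]) -> Dict[str, str]:
    """Produce every requested format from the one source document."""
    return {fmt: RENDERERS[fmt](doc) for fmt in formats}


sample = Document(
    title="Book Liberation",
    sections=[Section("Open Access", ["Knowledge should circulate freely & openly."])],
)
outputs = convert(sample, ["html", "txt"])
```

A real converter would also need templated layout, metadata and validation for formats such as EPUB and PDF; the sketch only shows the shape of the one-source, many-outputs approach.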

A-machine - a software ecology supplied by different
providers to complete the major steps in the publishing workflow
and connect our existing document structuring work. This
includes: meta description frameworks; layout template designs;
distribution and sales; Open Access and OER formatting;
validation, etc. http://a-machine.net


Publisher Prototypes 

* Merve Remix - in which we digitised Merve Verlag’s
back catalogue of 150 titles, and are making it available online
for remixing. http://merve.consortium.io
http://www.consortium.io/merve-remix

* Museums and Post-digital Publishing - with
Fotomuseum Winterthur, and the publication Manifeste!, in
which we examine how the high quality museum catalogue can
be digitised and taken into open learning and other contexts.
http://www.consortium.io/fotomuseum

* Traces of McLuhan - a Media Sprint at the Marshall
McLuhan Salon - McLuhan archive, where we create a
transmedia trace of a user’s journey through the archive, using
Heidelberg University’s Tamboti platform and a second archive
platform, Pandora. http://www.consortium.io/traces-mcluhan

* Moos Verlag - where we look to engage a new
community with a 1970s urbanism publishing collection. This
involves book scanning and re-publishing titles free online.
http://consortium.io/moos-verlag
http://www.librarything.de/catalog/moosbeat&tag=Urban%2BDesign


Publication Forensics 
Here it is important to bear in mind our objective of making
software infrastructures for publishing that are reliable, based on
a free flow of knowledge using Open Source methodologies, and
cost effective for publishers. There are three components to this
software design process: understanding the real world problem,
making an imaginative leap and, finally, a precise weaving of the
first two into the material of the process in order to reduce
ambiguity through an iterative requirements building process.
This is where ‘Publication Forensics’ has emerged as a practice in
our research. So far it has guided several key projects. These have
included immersion in hundreds of volumes of the Merve Verlag
back catalogue and manually reconstructing scanned texts back
into a data object semantically resembling a book. Also notable
has been compiling a lexicon of all scholarly publishing types
known inside Wikipedia: Festschrift, Gloss, Leak, Liquid book,
TED Book, etc.
https://en.wikipedia.org/wiki/User:Mrchristian%5CBooks%5CA_Publication_Taxonomy

Research Publications
We have established a publishing programme for a number of
reports, special dossiers, ‘good practices’ guides, manuals and
reference materials, all of which can be found on our GitHub
repository as hybrid publications.
https://github.com/consortium/hybrid-publishing-research
1. A Publication Taxonomy, 2014.
https://github.com/consortium/publication-taxonomy
2. Book Scanning & Book Scanning Manual, 2014.
http://bookscanner.consortium.io
3. Structured document writing manual and style guide, 2015.
4. Workflows/lifecycles - see example ‘periodic table’ below,
2015.
5. Publication Ready Outputs - definitions and guides for multi-format
publication output targets and design issues
6. Standards guide to structure documents for multi-format
publishing, 2015.
7. Technical maps of publishing infrastructure software and
systems, 2015.

[Figure: Digital Publishing Periodic Table, Simon Worthington, 2012.]

9. CONCLUSION
The idea of the free circulation of knowledge guides the HPC
and helps bind partners together in our collaborations. This
is a suitable moment to reflect upon the Open Access (OA)
movement in academic publishing and its progress in book
liberation since the Budapest Open Access Initiative (BOAI)
was launched in 2002. OA has since been adopted as the
norm in many jurisdictions, but vested interests are creating
confusion and political difficulties. As recently as last year
the Netherlands government took a hard line with Elsevier,
by withdrawing all payment to the company unless it
complied with the government’s OA policies. Elsewhere, the
UK has opted for sweeteners to the publishing industry under
the Gold Open Access scheme advocated by the Finch
Report, under which researchers pay publishers for the
right to publish through Open Access. Meanwhile, over in the
US Elsevier pays lobbyists to tighten research copyright
(Lessig), which has led Harvard Magazine to label academia
‘The Wild West’ (Harvard Magazine) of publishing. What is
important to keep in mind is not the staggering annual profits
corporations make from publishing, although in the case of
Reed Elsevier this is £826 million per annum (2013) from
academic publishing, specifically Science, Technology and
Medicine (STM), straight out of the public purse. The real
issue is that human knowledge cannot be shared and used to
benefit humankind: the result of paying this near €1 billion
annually is that only a relative handful of people can read or
use academic publications.

Turning to publishing in general, the question of how Open
Access (AKA book liberation) can be mapped onto this large
and varied industry is complex. In the EU alone publishing is
the largest creative industry, with an annual turnover in the
region of €23 billion (2009).

A functioning and equitable economy is needed to support
‘free-at-the-point-of-reading’ and ‘free-to-re-use’ publishing
models. Corporate capitalism does nothing but skim off the
profitable parts of the stack while imposing distribution
monopolies, leaving the bulk of publishers to live with the
constant drone of ‘start-up’, ‘entrepreneurialism’ and
‘disruption’ as their only strategies for finding some
imagined and as yet unknown economic model - an ever
elusive Eldorado. These capitalistic mantras, however
thin they might wear, still keep the thinking on these
issues confined.

The Book Liberation Manifesto suggests two ways forward
for publishing. First, a redistribution of the profits by the top
earners of the publishing industry to the lower rungs, or for those
top earners to pay for the release of publications into public
circulation. Second, for Open Source publishing software to be
treated as infrastructure and to receive the same funding that
national broadcast networks receive, or for it to be maintained
and enhanced in the way other kinds of basic infrastructure
provision are supported. The result of such infrastructures
providing low cost ways of reaching publics via digital channels
would be that publishers could afford to experiment and
innovate. Ironically Charles Babbage, the inventor of the first
mechanical computer, identified these capitalistic traits and
their limiting effect on publishing nearly two centuries ago in
1832, in a book chapter entitled ‘On Combinations of Masters
Against the Public’. What Babbage showed, via detailed
calculations of labour and materials, was that publishers were
falsely inflating the price of books, putting them out of reach of
the common people. To no one’s surprise his book was banned by
the publishing trade. Now that the descendants of Babbage’s
‘Difference Engine’ are at our fingertips in the form of the
modern computer it is time to take a lead from the computer
scientist Alan Kay:

‘The best way to predict the future is to invent it.’

Alan Kay, at a meeting of PARC, Palo Alto Research Center.

About the Hybrid Publishing Consortium
The Hybrid Publishing Consortium is a research group which
supports Open Source software for publishing infrastructures.

The Hybrid Publishing Consortium model of software is
built on user research and rapid prototyping. The architectural
approach is modular and back end orientated, with an emphasis on
application frameworks, ISO standards and interoperability
between services and providers, as opposed to creating
stand-alone web facing applications.

The Hybrid Publishing Consortium is a research group of
the Hybrid Publishing Lab in collaboration with partners and
associates. The Hybrid Publishing Lab is part of the Leuphana
University of Lüneburg Innovations-Inkubator, financed by the
European Regional Development Fund and co-funded by the
German federal state of Lower Saxony. As a business incubator
our research is conducted with industry partners and we are
supported to create new startup business ventures. Currently
HPC has one startup, InfoMesh UG, which is specialising in MLA
(Museums, Libraries and Archives) publishing.

Software
A-machine
Sublime Text 2
Scribus
Bootstrap
Transpect
Calibre
Javascripts

SaaS
Google Docs
GitHub

Standards formal and de jure (ISO, W3C, IETF, UTR etc)
HTML5
CSS
EPUB 2.1
EPUB 3.0
PDF
Dublin Core
BICS
BISG
ISBN
XML
SHA

Standards de facto
ORCID

Print format - ‘A’ format ‘pocket’ size 178x111mm

Fonts
Charis SIL - SIL (originally known as the Summer Institute of Linguistics, Inc.)
Work Sans

Distribution & platforms
Anagram via Mute Publishing
Metamute.org
A-machine - Research Viewer
Lightning Source US/UK Printing
Ingram Advance Catalog
Nielsen UK Pub Web
Amazon Pro Seller
Amazon Kindle
Ingram Spark
Apple iBooks
Google Play
Issuu
GitHub
Aaaaarg
Archive.org
OpenLibrary
LibraryThing

The Book Liberation Manifesto is an exploration of
publishing outside of current corporate constraints and
beyond the confines of book piracy. We believe that
knowledge should be in free circulation to benefit
humankind, which means an equitable and vibrant
economy to support publishing, instead of the prevailing
capitalist hand-me-down system of Sisyphean economic
sustainability. Readers and books have been forced into
pirate libraries, while sales channels have been
monopolised by the big Internet giants which exact
extortionate fees from publishers. We have three proposals.
First, publications should be free-at-the-point-of-reading
under a variety of open intellectual property regimes.
Second, they should become fully digital, in order to
facilitate ready reuse, distribution, algorithmic and
computational use. Finally, Open Source software for
publishing should be treated as public infrastructure,
with sustained research and investment. Such robust
infrastructures will mean lower costs for
manufacturing and faster publishing lifecycles, so that
publishers and publics will be more readily able to afford
to invent new futures.